About Metrics for Clone Detection

Author: Lavoie Thierry
Merlo Ettore
Publication venue: European Association of Software Science and Technology
Publication date: 26/07/2014

Clone detectors rely on similarity and dissimilarity measures to identify cloned fragments. The choice of a specific distance function in a clone detector is arbitrary to some extent. However, with a deeper knowledge of similarity measures, we can constrain this choice so that it has properties that help improve the scalability and quality of tools. This paper presents some interesting results, insights and questions about similarity and dissimilarity measures, including a somewhat counter-intuitive result on the cosine distance.

Electronic Communications of the EASST (European Association of Software Science and Technology)

06301 Abstracts Collection -- Duplication, Redundancy, and Similarity in Software

Author: Koschke Rainer
Merlo Ettore
Walenstein Andrew
Publication venue: Dagstuhl Seminar Proceedings. 06301 - Duplication, Redundancy, and Similarity in Software
Publication date: 01/01/2007

From 23.07.06 to 26.07.06, the Dagstuhl Seminar 06301 "Duplication, Redundancy, and Similarity in Software" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

Dagstuhl Research Online Publication Server

Comparison and Evaluation of Clone Detection Tools

Author: Ettore Merlo
Giuliano Antoniol
Jens Krinke
Rainer Koschke
Stefan Bellon
Publication date: 01/01/2007

Many techniques for detecting duplicated source code (software clones) have been proposed in the past. However, it is not yet clear how these techniques compare in terms of recall and precision as well as space and time requirements. This paper presents an experiment that evaluates six clone detectors based on eight large C and Java programs (altogether almost 850 KLOC). Their clone candidates were evaluated by one of the authors as an independent third party. The selected techniques cover the whole spectrum of the state-of-the-art in clone detection. The techniques work on text, lexical and syntactic information, software metrics, and program dependency graphs

CiteSeerX

UCL Discovery

PolyPublie

Levenshtein edit distance-based type III clone detection using metric trees

Author: Lavoie Thierry M.
Merlo Ettore
Publication date: 01/02/2011

This paper presents an original technique for clone detection with metric trees using Levenshtein distance as the metric defined between two code fragments. This approach achieves a faster empirical performance. The resulting clones may be found with varying thresholds allowing type 3 clone detection. Experimental results of metric trees performance as well as clone detection statistics on an open source system are presented and give promising perspectives

PolyPublie
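
The abstract above describes clone detection with metric trees under the Levenshtein distance, with a varying threshold for type 3 clones. As a minimal sketch of that idea, not the authors' implementation, the Python below indexes token sequences in a BK-tree, which is one simple kind of metric tree chosen here as an assumption. The names `levenshtein` and `BKTree`, the whitespace tokenization, and the example fragments are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence, Tuple


def levenshtein(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance with unit-cost insert, delete and substitute."""
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


class _Node:
    def __init__(self, fragment: Sequence[str]) -> None:
        self.fragment = fragment
        self.children: Dict[int, "_Node"] = {}  # edge distance -> child


class BKTree:
    """BK-tree: a metric tree for integer-valued distances (assumed variant)."""

    def __init__(self) -> None:
        self.root: Optional[_Node] = None

    def add(self, fragment: Sequence[str]) -> None:
        if self.root is None:
            self.root = _Node(fragment)
            return
        node = self.root
        while True:
            d = levenshtein(fragment, node.fragment)
            child = node.children.get(d)
            if child is None:
                node.children[d] = _Node(fragment)
                return
            node = child

    def search(self, query: Sequence[str],
               threshold: int) -> List[Tuple[int, Sequence[str]]]:
        """Return (distance, fragment) pairs within `threshold` of `query`."""
        results: List[Tuple[int, Sequence[str]]] = []
        stack = [self.root] if self.root is not None else []
        while stack:
            node = stack.pop()
            d = levenshtein(query, node.fragment)
            if d <= threshold:
                results.append((d, node.fragment))
            # The triangle inequality lets us skip subtrees whose edge
            # distance lies outside [d - threshold, d + threshold].
            for edge, child in node.children.items():
                if d - threshold <= edge <= d + threshold:
                    stack.append(child)
        return results


if __name__ == "__main__":
    tree = BKTree()
    for line in ["int i = 0 ;", "int j = 0 ;", "return x + y ;"]:
        tree.add(line.split())
    print(tree.search("int i = 0 ;".split(), threshold=1))
```

Raising `threshold` admits fragments that differ by more edits, which is how a single index can serve several similarity levels. The pruning is only valid because the Levenshtein distance satisfies the triangle inequality.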

Detection of redundant clone relations based on clone subsumption

Author: Lavoie Thierry M.
Merlo Ettore
Publication date: 01/04/2009

Clone detection has been presented in the literature at different levels of fragment granularity, from functions, to syntactic blocks, to variable-length strings of source code or tokens. String matching approaches, prefix and suffix trees, metrics, syntactic approaches and others can be used to compare fragments for similarity. Inclusion relations between source code lines may cause some clone relations to be redundant, when cloned code fragments subsume each other. This may occur between nested blocks of source code, for example. An original method to analyze this kind of redundancy in clone relations is presented. The proposed method is based on efficiently combining clone subsumption information with clone similarity relations on code fragments. The amount of redundancy in clone relations has been evaluated on two open source Java systems, Tomcat and Eclipse. Experimental results are presented. Execution time performance of redundancy analysis is measured and reported. Results are discussed together with further proposed research.

PolyPublie

Insider threat resistant SQL-injection prevention in PHP

Author: Antoniol Giuliano
Letarte Dominic
Merlo Ettore
Publication date: 01/05/2006

Web sites are either static sites, programs, or databases. Very often they are a mixture of these three aspects, integrating relational databases as a back-end. Web sites require configuration and programming attention to ensure the security, confidentiality, and trustworthiness of the published information. SQL-injection attacks rely on weak validation of textual input used to build database queries. Maliciously crafted input may threaten the confidentiality and the security policies of Web sites relying on a database to store and retrieve information. Furthermore, insiders may introduce malicious code in a Web application, code that, when triggered by some specific input, for example, would violate security policies. This paper presents an original approach that combines static analysis, dynamic analysis, and code reengineering to automatically protect applications written in PHP from both malicious input (outsider threats) and malicious code (insider threats) that carry SQL-injection attacks. The paper also reports preliminary results about experiments performed on an old SQL-injection prone version of phpBB (version 2.0.0, 37193 LOC of PHP version 4.2.2 code). Results show that our approach successfully improved phpBB-2.0.0 resistance to SQL-injection attacks.

PolyPublie

A feedback based quality assessment to support open source software evolution: the GRASS case study

Author: Ettore Merlo
Giuliano Antoniol
Markus Neteler
Salah Bouktif
Publication date: 01/01/2006

CiteSeerX

PolyPublie

Mapping features to source code in dynamically configured avionics software

Author: Gagnon Martin
Gauthier François
Merlo Ettore
Ouellet Maxime
Sozen Neset
Publication date: 01/02/2012

Mapping software features to the code that implements them is an important activity for program comprehension and software reengineering. In this paper, we present a novel automated approach to locate features in source code based on static analysis and model checking. This approach focuses on dynamically configured software in which the activation of specific features is controlled by configuration variables. The main advantages of a static approach to feature location are its affordability and applicability to large systems containing hundreds of features. Our methodology is applied to an industrial Flight Management System from the avionics industry. Results show that a static approach to feature mapping is feasible and can locate complex features whose implementation is spread across multiple files and functions

PolyPublie

How to Certify Machine Learning Based Safety-critical Systems? A Systematic Literature Review

Author: An Le
Antoniol Giulio
Khomh Foutse
Laberge Gabriel
Laviolette François
Merlo Ettore
Mindom Paulina Stevia Nouwou
Nikanjam Amin
Pequignot Yann
Tambon Florian
Publication date: 01/12/2021

Context: Machine Learning (ML) has been at the heart of many innovations over the past years. However, including it in so-called 'safety-critical' systems such as automotive or aeronautic has proven to be very challenging, since the shift in paradigm that ML brings completely changes traditional certification approaches. Objective: This paper aims to elucidate challenges related to the certification of ML-based safety-critical systems, as well as the solutions that are proposed in the literature to tackle them, answering the question 'How to Certify Machine Learning Based Safety-critical Systems?'. Method: We conduct a Systematic Literature Review (SLR) of research papers published between 2015 and 2020, covering topics related to the certification of ML systems. In total, we identified 217 papers covering topics considered to be the main pillars of ML certification: Robustness, Uncertainty, Explainability, Verification, Safe Reinforcement Learning, and Direct Certification. We analyzed the main trends and problems of each sub-field and provided summaries of the papers extracted. Results: The SLR results highlighted the enthusiasm of the community for this subject, as well as the lack of diversity in terms of datasets and types of models. It also emphasized the need to further develop connections between academia and industry to deepen the domain study. Finally, it also illustrated the necessity to build connections between the above-mentioned main pillars, which are for now mainly studied separately. Conclusion: We highlighted current efforts deployed to enable the certification of ML-based software systems, and discuss some future research directions.

Comment: 60 pages (92 pages with references and complements), submitted to a journal (Automated Software Engineering). Changes: Emphasizing the difference between traditional software engineering and ML approaches. Adding Related Works, Threats to Validity and Complementary Materials. Adding a table listing paper references for each section/subsection.

arXiv.org e-Print Archive

PolyPublie

Predicting Web site access: an application of time series

Author: Antoniol Giuliano
Casazza Gerardo
Di Lucca Giuseppe A.
Merlo Ettore
Di Penta Massimiliano
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2001

Crossref

PolyPublie